SOCKS access for MS-Windows Mosaic
----------------------------------

This code provides SOCKS access for standard MS-Windows Mosaic 2.0alpha2
clients.  This is probably the only README file you need to read. 


BACKGROUND
----------

Kevin Altis <altis@ibeam.jf.intel.com> recently announced some changes
to the CERN WWW library code to provide redirection of requests from WWW
clients.  The full text of this announcement is appended here.

In summary, this allows clients compliant with the CERN library to be
pointed at an httpd daemon, which will satisfy WWW requests on their
behalf.  Win-Mosaic 2.0alpha2 is such a client. 

Thus, this mechanism can be used to provide proxied Internet access to
Win-Mosaic clients that cannot themselves connect directly to Internet
resources, because they are inside an Internet `Firewall'. 

There are two ways to set this up:

 1. You may run a proxying daemon (such as CERN's httpd 2.16beta) on
    a system which does have direct Internet access -- such as  your
    Internet  Firewall  bastion  system  itself  --  and  point your
    Win-Mosaic clients at this.  The disadvantage of  this  is  that
    you  have to run another, large piece of software (the httpd) on
    your bastion; I was not happy to do this. 

or

 2. You  may  build  a  SOCKS-compliant version of this proxy httpd,
    which can then run on an _internal_ host, and can  still  access
    Internet resources on behalf of its Win-Mosaic clients, thus:

        Win Mosaic -> SOCKSified httpd -> sockd -> Internet
                                       or
                                       -> direct -> local 

    The  advantage  of  this  to SOCKS users is that Internet access
    remains governed by the  configuration  of  the  sockd  on  your
    Firewall  bastion.   

    (Note  that this is a scheme already pioneered by Dick St Peters
    <stpeters@spare-parts.crd.ge.com> for use with Mac-Mosaic, using
    an earlier version of the WWW redirection mechanism.  However, the
    new WWW  Library  mechanism  makes  implementation  considerably
    simpler.)
                    

THE SOCKSIZED HTTPD
-------------------

The SOCKSification of the CERN httpd is relatively simple with SOCKS
version 4.2.  It is just a question of SOCKSizing only the WWW library
calls, and not those that the daemon uses for local Win-Mosaic client
connections!  (See `SOCKS #defines' later.)

The distribution of the CERN httpd (2.16beta) included here has all the
required changes made to it.  After unpacking the distribution:

 1. cd ./WWW
    
 2. Edit  the  file  `BUILD', to set values for the two variables at
    the top:

        SOCKSLIB        set this to point to the location of
                        your ready-built SOCKS library; this
                        must be Version 4.2 (or later) built
                        with the -DSHORTENED_RBIND option.

        SOCKS_FLAGS     set this to "-DSOCKS" to enable the
                        SOCKS code, or..

                        set it to "-DSOCKS -DCLIENT_CONTROL"
                        to enable both SOCKS and control over
                        which WWW clients are allowed to use
                        your httpd proxy daemon (see below).

 3. Type `./BUILD'    


    Everything  should  now  build  for  you.  To use your SOCKSized
    httpd  you   also   need   a   configuration   file,   typically
    /etc/httpd.conf.   A simple yet adequate version of this file is
    this:
            
        suffix  *.* text/plain 7bit
        pass http:*
        pass ftp:*
        pass wais:*
        pass gopher:*
        fail news:*
        fail *


       (An aside to explain this file, if you care:

        The  `suffix'  line is to work around an annoying feature of
        the interaction between Win-Mosaic and httpd proxying.   The
        httpd  server  `types'  objects according to their suffix on
        behalf of the Win Mosaic client.  And the default  rule  for
        files  whose  names  contain a dot but which have an unknown
        suffix is to type them as `application/octet-stream',  which
        Win-Mosaic has said (in the request dialogue) that it cannot
        cope with.  The httpd proxy thus tries to  convert  this  to
        `www-present'  and  fails.   Thus, if the PC has asked for a
        file  called  `FRED.README',  you  will   see   the   server
        successfully  acquire  it  on its behalf, and then fail when
        trying to pass it back.  This suffix rule fixes the problem:
        it changes the default for files with an unknown suffix so
        that they are assumed to be plain text and passed back
        anyway.  That is probably what one wants, and it prevents
        lots of puzzling failures.
        
        The `pass' lines allow access to remote resources of each
        specified type, and `fail news:*' refuses to proxy news.  The
        `fail *' line at the end blocks access to any local resources
        (ie, on the system where the httpd is running). 
                
        For   more   details   on  the  syntax  available  in  httpd
        configuration files, including the use of  CACHES,  see  the
        URL:
            
        <http://info.cern.ch/hypertext/WWW/Daemon/User/RuleFile.html>)


 4. If you enabled the CLIENT_CONTROL code, you'll also need a file
    called /etc/httpd.yacf, which is simply a list of the names of hosts
    that you want to allow to use this proxy daemon.  An example of this
    might be:

        fred.my.dom
        anna.my.dom
        harry.my.dom

    For more explanation about this, see `CLIENT CONTROL' later. 


 5. You can now start your proxy server.  To run it listening on port
    80 and logging to a logfile, use a command like this as root:
    
            httpd -p 80 -l logfile

    There's also a -v option for verbose output of what it's doing,
    if you hit problems and want to try to work out why. 


 6. Next you must configure your SOCKS server sockd to allow connections
    outwards from the httpd daemon.

    By  default,  the  daemon will change uid to `nobody' and gid to
    `nogroup' when retrieving a request (though you can  change  the
    values using the configuration options `UserId' and `GroupId' in
    your /etc/httpd.conf file).  Thus, this is the  `user'  who  the
    SOCKS  request  will  appear  to have come from.  Make sure that
    your sockd allows it.
        

 7. Now you must point your Win-Mosaic clients at this proxy.  The
    instructions for doing this are in Kevin Altis's announcement
    (appended).  Remember that you MUST have Win-Mosaic 2.0alpha2
    or later.
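
As a rough sketch of steps 3 and 4 (it is not part of the distribution),
the two files can be staged in a scratch directory and looked over
before being copied into /etc.  The variable names `stage_dir' and
`last_rule' are only illustrative.  The final check assumes that rules
are applied in order, so that the catch-all `fail *' belongs at the end,
as the appended announcement suggests with its "(before last fail)".

```shell
#!/bin/sh
# Sketch only: stage the files from steps 3 and 4 in a scratch
# directory (an assumption; the README installs them under /etc).
stage_dir=$(mktemp -d)

# Proxy rules, copied from step 3.
cat > "$stage_dir/httpd.conf" <<'EOF'
suffix  *.* text/plain 7bit
pass http:*
pass ftp:*
pass wais:*
pass gopher:*
fail news:*
fail *
EOF

# Allowed client hosts, copied from step 4.  This file is only needed
# when the daemon was built with -DCLIENT_CONTROL.
cat > "$stage_dir/httpd.yacf" <<'EOF'
fred.my.dom
anna.my.dom
harry.my.dom
EOF

# Check that the catch-all rule is still the last line.
last_rule=$(tail -n 1 "$stage_dir/httpd.conf")
if [ "$last_rule" = "fail *" ]; then
    echo "staged files in $stage_dir; review them, then copy to /etc"
else
    echo "unexpected last rule: $last_rule" >&2
fi
```

Nothing here starts the daemon; once the files are in place it is
started as described in step 5.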


This works for us.  Let me know if it doesn't for you!  


                        I.      (14-March-94)


Ian Dunkin <imd1707@ggr.co.uk>. 



Some further details follow:


CLIENT CONTROL
--------------

In the current httpd (2.16beta), you can control very precisely which
groups (users/systems etc) can access local, served files, and which
files they can get at.  And when running as a proxy, you can control
which remote resources (eg, destination WWW servers) you will proxy for. 

But there is no way to control which groups can use your httpd server
proxy at _all_. 

My use of the SOCKSized httpd daemon is as a proxy for PCs running
Win-Mosaic.  Because of bandwidth problems I need to control _which_ PCs
can use this proxy: I can't allow access to just any PC user on our
networks. 

This control is not possible with httpd 2.16, because it simply skips
protection checks for non-local resources (ie, ones it is simply
proxying on and not serving itself).  I've confirmed this with Ari
Luotonen at CERN.  He agrees that this control is something people want,
and says it will be provided in version 2.17 of the httpd. 

Just to keep me going till then, I added crude and simple code for two
routines -- init_allowed_hosts() and is_host_allowed() -- to HTDaemon.c. 
init_allowed_hosts(), called at daemon startup, simply reads a file of
allowed hosts into memory.  is_host_allowed() is called from HTAAServ.c
to check each client connection and allow it or deny it with an
appropriate message. 

  (I originally experimented with trying to use the existing `protect'
  code to control client access, but  this  got  very  messy  as  it's
  confounded  with  request  expansion.   Thus, I opted in the end for
  this wholly independent mechanism: as I said,  this  should  not  be
  required  after  the next version of httpd, so it's not worth making
  it too fancy.)
  
If you want to use this client control code, you'll need to enable
-DCLIENT_CONTROL in WWW/BUILD, and to provide the list of allowed hosts
in /etc/httpd.yacf. 
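
The kind of lookup this amounts to can be tried out from the shell
before touching the daemon.  What follows is a sketch only, and not the
C code added to HTDaemon.c: the helper `host_in_list' and the scratch
file `allow_file' are hypothetical names, and exact whole-line matching
of host names is an assumption, since nothing above says how the daemon
compares them.

```shell
#!/bin/sh
# Sketch only: mimic the allow-list lookup with grep.
# The list lives in a scratch file here (an assumption; the daemon
# reads /etc/httpd.yacf).
allow_file=$(mktemp)
cat > "$allow_file" <<'EOF'
fred.my.dom
anna.my.dom
harry.my.dom
EOF

# Hypothetical helper: succeeds when $1 appears as a whole line of
# the list.  -F treats the name as a fixed string, -x requires a
# full-line match, and -q suppresses output.
host_in_list() {
    grep -F -x -q -- "$1" "$allow_file"
}

for host in fred.my.dom mallory.my.dom; do
    if host_in_list "$host"; then
        echo "$host: allowed"
    else
        echo "$host: denied"
    fi
done
```

With whole-line matching, a partial name such as `fred.my' is refused
rather than being treated as a prefix of `fred.my.dom'.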


SOCKS #defines
--------------

The SOCKS 4.2 redefinitions of the socket routines are done in
WWW/Library/Implementation/tcp.h.  Rather horridly, I made them
conditional thus:

    #if defined(SOCKS) && !defined(RULE_FILE)

RULE_FILE is defined only when building the httpd, and not when building
the Library or the Linemode browser.  Thus, the socket calls that the
daemon itself uses for connections from local clients are _not_
redefined, and so are not SOCKSized.  All other calls are.  (Messy, eh?)
Of course we call SOCKSinit() in HTDaemon.c too. 


SGI/IRIX
--------

Two of the patches to the source simply fix build problems on SGI/IRIX
systems.  I've reported these to CERN, and they'll be fixed in the next
release of httpd.



ORIGINAL ANNOUNCEMENT OF NEW WWW LIBRARY MECHANISM
--------------------------------------------------

  Date: Wed, 9 Feb 1994 14:51:08 -0800
  From: Kevin Altis <altis@ibeam.jf.intel.com>
  To: mmosaic-fire@ncsa.uiuc.edu, www-talk@www0.cern.ch
  Subject: proxy gateway service announcement/testing
  
  I'm pleased to announce the availability of a new application level proxy
  gateway service for the WWW. This is an adaptation of Tim Berners-Lee's
  gateway code used in libWWW. Lou Montulli, Ari Luotonen, and I (Kevin
  Altis) have been working together for the last month or so on this problem
  and we now have something for you to try; the NCSA folks are working the
  changes into their code as well. The proxy supports http, gopher, ftp,
  wais, and news. Full HTTP 1.0 methods are supported including POST, PUT and
  authentication.
  
  The proxy service is based on the HTTP protocol, and for clients and
  servers based on libWWW only requires a small number of changes. Eric Bina
  has already made the changes for X Mosaic 2.2. Mosaic for the Mac and
  Windows will soon also have support for the proxy service. Client and
  server writers interested in supporting the proxy service should contact me
  for more information until we get a full description up on the Web.
  
  More messages will follow on this list for discussion of TESTING this proxy
  gateway service, problems you might encounter, etc. This is not a general
  release announcement for comp.infosystems.www.
  
  CLIENTS CURRENTLY SUPPORTING PROXY: Lynx2-2, X Mosaic 2.2
  The prerelease Lynx 2-2 source files are available at
  <ftp://stat1.cc.ukans.edu/pub/lynx/lynx2-2.tar.Z> along with a lynx.cfg in
  the same directory.
  
  X Mosaic 2.2 is available at <ftp://ftp.ncsa.uiuc.edu/Mosaic>
  
  Lynx 2-2 and X Mosaic 2.2 support the proxy service through environment
  variables. The environment variables are of the form protocol_proxy and
  expect a full URL. Within my company we use the following shell script to
  launch lynx, which automatically sets the environment variables to the
  appropriate proxy machine. Note, that we aren't proxying news since that is
  always accessed from a local machine. No other setup is required on the
  client side! NOTE: substitute your real gateway server and port number for
  "somehost.intel.com:911" below.
  
  #!/bin/sh
  http_proxy=http://somehost.intel.com:911/
  ftp_proxy=http://somehost.intel.com:911/
  wais_proxy=http://somehost.intel.com:911/
  gopher_proxy=http://somehost.intel.com:911/
  export http_proxy
  export ftp_proxy
  export wais_proxy
  export gopher_proxy
  /usr/local/bin/lynx2-2
  
  
  There is also a prerelease debug version of Win Mosaic with proxy support
  at <ftp://ftp.ncsa.uiuc.edu/outgoing/jonm/proxy.zip>; a general release
  version should be available sometime next week. Setup in Win Mosaic simply
  requires the addition of the following lines in the mosaic.ini file:
  [proxy information]
  http_proxy=http://somehost.intel.com:911/
  ftp_proxy=http://somehost.intel.com:911/
  wais_proxy=http://somehost.intel.com:911/
  gopher_proxy=http://somehost.intel.com:911/
  
  PROXY SERVERS:
  There is a test proxy server running at <http://www1.cern.ch:911/>. You can
  run your own by getting the prerelease SUN4 binary of cern_httpd 2.15
  available at
  <ftp://info.cern.ch/pub/www/bin/sun4/httpd_2.15pre3-gcc-static-lresolv.Z>
  supports the new proxy method. Ari will make the cern_httpd sources
  available later this month for compilation on other Unix platforms.
  
  You can start your copy of the cern_httpd server to run as a standalone
  proxy server with a command line of:
          cern_httpd -p 911 -r httpd.conf -l access_log
  
  The httpd.conf should look like:
  pass    http:*
  pass    ftp:*
  pass    wais:*
  pass    gopher:*
  fail    news:*
  fail    *
  
  The cern_httpd can be used as a regular server at the same time, if
  needed.  Just add normal mappings and passes to the end (before
  last fail):
  
          Pass  /*  file:/document/root/*
  
  However, for testing purposes, I would suggest that you run the cern_httpd
  only as a proxy. The cern_httpd will automatically do setuid-nobody if it
  runs as root. For more information on the cern_httpd server, see the
  documentation at <http://info.cern.ch/hypertext/WWW/Daemon/Status.html>.
  
  All requests using a proxy server such as the cern_httpd are logged exactly
  the same as normal HTTP requests, except that full URLs appear in the log
  file rather than partial URLs. The uniform log format currently being
  discussed on www-talk will be used in the cern_httpd server when the format
  is finalized.
  
  ka
  
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Kevin Altis                             2111 N.E. 25th.
  Internet Program Architect              Hillsboro, OR 97124
  Media Delivery Laboratory               Email: altis@ibeam.intel.com
  Intel Corporation JF2-58                Phone: 503-696-8788
                                          Fax: 503-696-6067
  
  
  
  
  
